Machine Translation Model based on
Author
Abstract
Although parallel corpora play an irreplaceable role in machine translation, their scale and coverage still fall short of actual needs. Non-parallel corpus resources on the web have great potential value for machine translation and other natural language processing tasks. This article proposes a semi-supervised transductive learning method that expands the training corpus of a statistical machine translation system by extracting parallel sentences from a non-parallel corpus. The method requires only a small labeled corpus and a large unlabeled corpus to build a high-performance classifier, which is especially useful when labeled data is scarce. The experimental results show that combining non-parallel corpus alignment with semi-supervised transductive learning exploits the strengths of both and improves the performance of the machine translation system.
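To make the idea of growing a classifier from a few labeled sentence pairs and many unlabeled ones more concrete, the following is a minimal sketch only. It uses plain self-training with a nearest-centroid classifier as a simple stand-in for semi-supervised learning; it is not the transductive method of the article, whose details are not given here. The two features (length ratio and bilingual-lexicon coverage), the toy lexicon, and the `margin` parameter are all assumptions introduced for illustration.

```python
from typing import Dict, List, Sequence, Tuple

Vector = Tuple[float, ...]


def features(src: List[str], tgt: List[str], lexicon: Dict[str, str]) -> Vector:
    """Two hypothetical features for a candidate sentence pair:
    token-length ratio and the share of source words whose lexicon
    translation appears in the target sentence."""
    length_ratio = min(len(src), len(tgt)) / max(len(src), len(tgt), 1)
    covered = sum(1 for w in src if lexicon.get(w) in tgt)
    return (length_ratio, covered / max(len(src), 1))


def centroid(points: Sequence[Vector]) -> Vector:
    n = len(points)
    return tuple(sum(p[i] for p in points) / n for i in range(len(points[0])))


def sq_dist(a: Vector, b: Vector) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def self_train(labeled: List[Tuple[Vector, int]],
               unlabeled: List[Vector],
               margin: float = 0.1,
               max_rounds: int = 10) -> List[int]:
    """Label the unlabeled vectors (1 = parallel, 0 = not parallel).

    Requires at least one labeled example of each class. In every round,
    points whose distance gap between the two class centroids is at least
    `margin` are treated as confident, added to their class, and the
    centroids are recomputed. Remaining points are labeled in a final pass.
    """
    pos = [x for x, y in labeled if y == 1]
    neg = [x for x, y in labeled if y == 0]
    pending = dict(enumerate(unlabeled))
    assigned: Dict[int, int] = {}
    for _ in range(max_rounds):
        c_pos, c_neg = centroid(pos), centroid(neg)
        confident = {}
        for i, x in pending.items():
            gap = sq_dist(x, c_neg) - sq_dist(x, c_pos)
            if abs(gap) >= margin:
                confident[i] = 1 if gap > 0 else 0
        if not confident:
            break
        for i, y in confident.items():
            (pos if y == 1 else neg).append(pending.pop(i))
            assigned[i] = y
    # Final pass: label whatever never became confident.
    c_pos, c_neg = centroid(pos), centroid(neg)
    for i, x in pending.items():
        assigned[i] = 1 if sq_dist(x, c_neg) > sq_dist(x, c_pos) else 0
    return [assigned[i] for i in range(len(unlabeled))]
```

In use, each candidate pair drawn from the non-parallel corpus would be converted with `features`, and the pairs labeled 1 by `self_train` would be added to the training corpus. A real system would need richer features and a proper bilingual lexicon.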
Similar resources
A new model for persian multi-part words edition based on statistical machine translation
Multi-part words in the English language are hyphenated, with the hyphen separating the different parts. Persian has multi-part words as well. According to Persian morphology, a half-space character is needed to separate the parts of multi-part words, but in many cases people incorrectly use a space character instead. This common incorrect use of space leads to some s...
A Hybrid Machine Translation System Based on a Monotone Decoder
In this paper, a hybrid Machine Translation (MT) system is proposed by combining the result of a rule-based machine translation (RBMT) system with a statistical approach. The RBMT uses a set of linguistic rules for translation, which leads to better translation results in terms of word ordering and syntactic structure. On the other hand, SMT works better in lexical choice. Therefore, in our sys...
A Comparative Study of English-Persian Translation of Neural Google Translation
Many studies abroad have focused on neural machine translation and almost all concluded that this method was much closer to humanistic translation than machine translation. Therefore, this paper aimed at investigating whether neural machine translation was more acceptable in English-Persian translation in comparison with machine translation. Hence, two types of text were chosen to be translated...
The Correlation of Machine Translation Evaluation Metrics with Human Judgement on Persian Language
Machine Translation Evaluation Metrics (MTEMs) are the central core of Machine Translation (MT) engines as they are developed based on frequent evaluation. Although MTEMs are widespread today, their validity and quality for many languages is still under question. The aim of this research study was to examine the validity and assess the quality of MTEMs from Lexical Similarity set on machine tra...
Translation Strategies in English to Persian Translation of Children's Literature based on Klingberg's Model
This research sought to identify the translation strategies adopted by the translator in Persian translation of 'whatever after, Fairest of all' written by 'Sarah Mlynowski' based on Klingberg's model (1986). To achieve the objectives of the study, a qualitative content analysis design was selected for it. The corpus of the study consisted of 60 pages of the novel 'whatever after, Fairest of al...
Assessing the Quality of Persian Translation of the Book “Principles of Marketing” Based on the House’s (TQA) Model
Translation is evaluated in terms of its forms and functions inside the historically developed systems of the receiving culture and literature. This study aimed to evaluate the quality of the Persian translation of the 14th edition of the original English book “Principles of Marketing” written by Philip Kotler and Gary Armstrong based on House's (TQA) model: overt and covert translation distinction. T...